 human moral judgment



When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

Neural Information Processing Systems

AI systems are becoming increasingly intertwined with human life. In order to effectively collaborate with humans and ensure safety, AI systems need to be able to understand, interpret and predict human moral judgments and decisions. Human moral judgments are often guided by rules, but not always. A central challenge for AI safety is capturing the flexibility of the human moral mind -- the ability to determine when a rule should be broken, especially in novel or unusual situations. In this paper, we present a novel challenge set consisting of moral exception question answering (MoralExceptQA) of cases that involve potentially permissible moral exceptions -- inspired by recent moral psychology studies. Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MoralCoT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments. MoralCoT outperforms seven existing LLMs by 6.2% F1, suggesting that modeling human reasoning might be necessary to capture the flexibility of the human moral mind. We also conduct a detailed error analysis to suggest directions for future work to improve AI safety using MoralExceptQA.



When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

Jin, Zhijing, Levine, Sydney, Gonzalez, Fernando, Kamal, Ojasv, Sap, Maarten, Sachan, Mrinmaya, Mihalcea, Rada, Tenenbaum, Josh, Schölkopf, Bernhard

arXiv.org Artificial Intelligence

AI systems are becoming increasingly intertwined with human life. In order to effectively collaborate with humans and ensure safety, AI systems need to be able to understand, interpret and predict human moral judgments and decisions. Human moral judgments are often guided by rules, but not always. A central challenge for AI safety is capturing the flexibility of the human moral mind -- the ability to determine when a rule should be broken, especially in novel or unusual situations. In this paper, we present a novel challenge set consisting of rule-breaking question answering (RBQA) of cases that involve potentially permissible rule-breaking -- inspired by recent moral psychology studies. Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MORALCOT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments. MORALCOT outperforms seven existing LLMs by 6.2% F1, suggesting that modeling human reasoning might be necessary to capture the flexibility of the human moral mind. We also conduct a detailed error analysis to suggest directions for future work to improve AI safety using RBQA. Our data is open-sourced at https://huggingface.co/datasets/feradauto/MoralExceptQA and our code at https://github.com/feradauto/MoralCoT.


There's Still Work to Do Addressing Ethics in Autonomous Vehicles

#artificialintelligence

There's a fairly large flaw in the way that programmers are currently addressing ethical concerns related to artificial intelligence (AI) and autonomous vehicles (AVs). Namely, existing approaches don't account for the fact that people might try to use the AVs to do something bad. For example, imagine that there is an autonomous vehicle with no passengers and it is about to crash into a car containing five people. It can avoid the collision by swerving out of the road, but it would then hit a pedestrian. Most discussions of ethics in this scenario focus on whether the autonomous vehicle's AI should be selfish (protecting the vehicle and its cargo) or utilitarian (choosing the action that harms the fewest people). But that either/or approach to ethics can raise problems of its own, according to Veljko Dubljević, an assistant professor in the Science, Technology & Society program at North Carolina State University.


AI for self-driving cars doesn't account for crime - Futurity

#artificialintelligence

You are free to share this article under the Attribution 4.0 International license. Existing approaches to artificial intelligence for self-driving cars don't account for the fact that people might try to use the autonomous vehicles to do something bad, researchers report. For example, let's say that there is an autonomous vehicle with no passengers and it's about to crash into a car containing five people. It can avoid the collision by swerving out of the road, but it would then hit a pedestrian. "…the simplistic approach currently being used to address ethical considerations in AI and autonomous vehicles doesn't account for malicious intent." Most discussions of ethics in this scenario focus on whether the autonomous vehicle's AI should be selfish (protecting the vehicle and its cargo) or utilitarian (choosing the action that harms the fewest people). But that either/or approach to ethics can raise problems of its own. "Current approaches to ethics and autonomous vehicles are a dangerous oversimplification -- moral judgment is more complex than that," says Veljko Dubljević, an assistant professor in the Science, Technology & Society (STS) program at North Carolina State University and author of a paper outlining this problem and a possible path forward. "For example, what if the five people in the car are terrorists?


AI for self-driving cars doesn't account for crime -- GCN

#artificialintelligence

Existing approaches to artificial intelligence for self-driving cars don't account for the fact that people might try to use the autonomous vehicles to do something bad, researchers report. For example, let's say that there is an autonomous vehicle with no passengers and it's about to crash into a car containing five people. It can avoid the collision by swerving out of the road, but it would then hit a pedestrian. Most discussions of ethics in this scenario focus on whether the autonomous vehicle's AI should be selfish (protecting the vehicle and its cargo) or utilitarian (choosing the action that harms the fewest people). But that either/or approach to ethics can raise problems of its own.